[SPARK-28701][test-hadoop3.2][test-java11][k8s] adding java11 support for pull request builds #25423
Conversation
Test build #108987 has started for PR 25423 at commit

from the console output: looks like we should be good! i'll let the build run and double-check everything when it's done.

test this please

i manually killed the initial test because i wanted to make sure that any k8s-based tests won't be affected by this change... the prb and k8s tests run on different machines (centos vs ubuntu), and while i am certain they won't have any JAVA_HOME collisions, it's cheaper to test and be sure.

Kubernetes integration test starting
Thank you, @shaneknapp!

Test build #108988 has finished for PR 25423 at commit
oof... this is definitely out of scope of this particular PR, but needs to be addressed. since my java skills are rather basic, i could really use some help here.

Kubernetes integration test status success

test this please

Kubernetes integration test starting

Test build #108989 has finished for PR 25423 at commit
It seems Scala 2.12.8 is not completely compatible with JDK 11? https://docs.scala-lang.org/overviews/jdk-compatibility/overview.html#jdk-11-compatibility-notes

need to run the k8s test again...
test this please |
ugh. :\ |
@srowen any ideas here? |
2.12.8 should work well enough for Spark's purposes; at least, we haven't been seeing test failures in anything but one module for a long time now. The Java module system is not something Spark uses.

Kubernetes integration test starting

well, it might be blocking this PR (SPARK-27365), and every single java11 build on jenkins is currently broken and failing hive integration tests.
Test build #108991 has finished for PR 25423 at commit
Yes, Spark does not yet pass tests on Java 11 because of Hive-related issues. That's the last big chunk of work. It's not a Scala issue, though.

@srowen -- it doesn't seem to be just Hive-related issues... testing this PRB against java11 shows that it's failing both the java/scala unidoc build and the k8s integration tests, so the TL;DR here is twofold.

Kubernetes integration test status failure
It's possible; I am not sure, for example, whether https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-jdk-11/252/console reaches the scaladoc phase. Hm, I wasn't aware that one wasn't checking K8S. Let me at least add these to the umbrella. I bet we can solve both without too much trouble.
retest this please |
retest this please |
Kubernetes integration test starting

Kubernetes integration test status success
Test build #109685 has finished for PR 25423 at commit
Test build #109686 has finished for PR 25423 at commit
| if "test-java11" in os.environ["ghprbPullTitle"].lower(): | ||
| os.environ["JAVA_HOME"] = "/usr/java/jdk-11.0.1" | ||
| os.environ["PATH"] = "%s/bin:%s" % (os.environ["JAVA_HOME"], os.environ["PATH"]) | ||
| test_profiles += ['-Djava.version=11'] |
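As a rough standalone illustration of what this hunk does (a hypothetical sketch, not the actual harness: the fake PR title below is invented, the JDK path is taken from the diff, and `ghprbPullTitle` is the variable the Jenkins GHPRB plugin sets from the PR title):

```python
import os
import subprocess

# Hypothetical sketch: simulate the Jenkins GHPRB plugin exporting the PR
# title, then apply the same environment switch this PR adds.
os.environ["ghprbPullTitle"] = "[SPARK-28701][test-java11] example title"

if "test-java11" in os.environ["ghprbPullTitle"].lower():
    os.environ["JAVA_HOME"] = "/usr/java/jdk-11.0.1"  # path assumed from the diff
    os.environ["PATH"] = "%s/bin:%s" % (os.environ["JAVA_HOME"], os.environ["PATH"])

# Child processes inherit the mutated environment, so on a machine where
# that JDK path exists this prints the JDK 11 version banner.
subprocess.call(["java", "-version"])
```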
Can we try to set this in the python tests too? Seems like the Java gateway has to use JDK 11 as well.
It should use Java 11 if the path provides Java 11 and the test harness that runs the Python tests does too. At least, I don't know how else one would tell pyspark what to use!

In fact, I'm pretty sure the test failure here shows that it is using JDK 11. From JPMML: `java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory`. This would be caused by JDK 11 changes: that class belongs to the JAXB implementation that was bundled with the JDK through Java 8 but removed in Java 11. However, I don't get why all the other non-Python tests don't fail.

Given the weird problem in #24651, I am wondering if we have some subtle classpath issues with how the Pyspark tests are run.

This one, however, might be more directly solvable by figuring out what is suggesting to use this old Sun JAXB implementation. I'll start digging around META-INF.
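One way to do that digging (a hypothetical helper script, not part of this PR) is to scan the jars on the classpath for the two files that can pin JAXB to a specific `ContextFactory`: a `META-INF/services/javax.xml.bind.JAXBContext` service file, or a `jaxb.properties` that sets `javax.xml.bind.context.factory`:

```python
import sys
import zipfile
from pathlib import Path

# Hypothetical helper (not part of this PR): scan jars for the files that
# can pin JAXB to the old com.sun.xml.internal.bind ContextFactory, which
# was removed from the JDK in Java 11.
SUSPECTS = ("META-INF/services/javax.xml.bind.JAXBContext", "jaxb.properties")

def scan(jar_dir):
    for jar in Path(jar_dir).rglob("*.jar"):
        with zipfile.ZipFile(jar) as zf:
            for name in zf.namelist():
                if name.endswith(SUSPECTS):
                    text = zf.read(name).decode("utf-8", "replace").strip()
                    print("%s :: %s -> %s" % (jar.name, name, text))

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else ".")
```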
Hm, and why does https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-jdk-11/ pass, then? It is doing the same thing in the Jenkins config. (OK, I think I answered my own question below.)

EDIT: Oh, because it doesn't run Pyspark tests?
No, actually you're right. Yes, it seems that after the Scala tests here, PATH and JAVA_HOME are still set as-is. I thought:
spark/python/pyspark/java_gateway.py, lines 45 to 60 at 209b936:

```python
SPARK_HOME = _find_spark_home()
# Launch the Py4j gateway using Spark's run command so that we pick up the
# proper classpath and settings from spark-env.sh
on_windows = platform.system() == "Windows"
script = "./bin/spark-submit.cmd" if on_windows else "./bin/spark-submit"
command = [os.path.join(SPARK_HOME, script)]
if conf:
    for k, v in conf.getAll():
        command += ['--conf', '%s=%s' % (k, v)]
submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS", "pyspark-shell")
if os.environ.get("SPARK_TESTING"):
    submit_args = ' '.join([
        "--conf spark.ui.enabled=false",
        submit_args
    ])
command = command + shlex.split(submit_args)
```

and, on the spark-submit side:

```scala
args.mainClass = "org.apache.spark.api.python.PythonGatewayServer"
```
Here, JDK 8 somehow happened to be used.

Actually, the PySpark tests and SparkR tests passed at #25443 (comment).

So the issue persists here... but I guess yes, we can do it separately, since this PR at least seems to set JDK 11 correctly, and it virtually doesn't affect any main or test code (if the title flag is not used).
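(For reference, a minimal standalone sketch rather than Spark code: a subprocess launched without an explicit `env=` inherits the parent's current `os.environ`, which is why the JAVA_HOME/PATH mutation in the test harness would be expected to reach the spark-submit gateway child as well.)

```python
import os
import subprocess
import sys

# Standalone sketch (not Spark code): a child process spawned without an
# explicit env= inherits the parent's os.environ, including mutations made
# before the spawn.
os.environ["JAVA_HOME"] = "/usr/java/jdk-11.0.1"  # value from this PR's diff

out = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['JAVA_HOME'])"]
)
print(out.decode().strip())  # -> /usr/java/jdk-11.0.1
```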
It's interesting. Thank you for the investigation, @srowen and @HyukjinKwon
Do we have a JIRA issue for this?
We probably need one, yeah, regardless of the cause. I'll file one to track.
I personally think this is OK to merge, simply because we need a way to test JDK 11, and this seems to do that. The rest of the error is orthogonal. So, in order to use this in a JDK 11 Jenkins build, how would one configure the Jenkins job? It is only triggering off the PR title (which is also useful). OK if that's a future step.
Yes, same conclusion
Merged to master. |
What changes were proposed in this pull request?
we need to add the ability to test PRBs against java11.
see comments here: #25405
How was this patch tested?
the build system will test this.